Discrete cross-modal hashing with relaxation and label semantic guidance

Teng, Shaohua; Huang, Wenbiao; Wu, Naiqi; Du, Guanglong; Chen, Tongbao; Zhang, Wei; Teng, Luyao

doi:10.1007/s11280-024-01239-6


Published: 20 January 2024

World Wide Web, Volume 27, article number 4 (2024)

Shaohua Teng¹,
Wenbiao Huang¹,
Naiqi Wu²,
Guanglong Du³,
Tongbao Chen¹,
Wei Zhang¹ &
Luyao Teng⁴


Abstract

Supervised cross-modal hashing has attracted considerable research attention. Existing studies either seek a common semantic space or directly regress the zero-one label information into the Hamming space. Although these methods have made considerable progress, they neglect two issues: 1) some methods designed for classification tasks are not suitable for retrieval tasks, since they do not learn the personalized features of each sample; 2) the outcomes of hash retrieval depend on both the length and the encoding method of the hash codes. Because a sample possesses more personalized features than its label semantics convey, in this paper we propose a novel supervised cross-modal hashing collaborative learning method called discrete Cross-modal Hashing with Relaxation and Label Semantic Guidance (CHRLSG). First, we introduce two relaxation variables as latent spaces. One is used to extract text features and label semantic information collaboratively, and the other is used to extract image features and label semantics collaboratively. Second, more accurate hash codes are generated from the latent spaces, since CHRLSG learns feature semantics and label semantics collaboratively, with labels playing the dominant role and features the auxiliary role. Third, we utilize labels to strengthen the similarity relationship between inter-modal samples by preserving pairwise closeness. Label semantics are fully exploited to avoid classification error. Fourth, we introduce class weights to further increase the discrimination between intra-modal samples that belong to different classes while keeping the similarity of samples unchanged. Therefore, the CHRLSG model not only preserves the relationship between samples but also maintains the consistency of label semantics during collaborative optimization. Experimental results on three common benchmark datasets demonstrate that the proposed model is superior to existing advanced methods.
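The abstract refers to hash codes in Hamming space and to label-derived pairwise similarity without showing what these objects look like. The following is a minimal sketch of those generic building blocks only, not the CHRLSG optimization itself, which the abstract does not specify. The latent vectors, the label vectors, and the helper names `binarize`, `hamming`, `rank_by_hamming`, and `label_similarity` are all hypothetical and chosen for illustration.

```python
from typing import List, Sequence


def binarize(latent: Sequence[Sequence[float]]) -> List[List[int]]:
    """Turn real-valued latent vectors into {-1, +1} hash codes via the sign."""
    return [[1 if v >= 0 else -1 for v in row] for row in latent]


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the positions where two equal-length codes differ."""
    return sum(x != y for x, y in zip(a, b))


def rank_by_hamming(query: Sequence[int], database: Sequence[Sequence[int]]) -> List[int]:
    """Return database indices sorted by Hamming distance to the query.

    Ties are broken by index so the ordering is deterministic.
    """
    dists = [hamming(query, code) for code in database]
    return sorted(range(len(database)), key=lambda i: (dists[i], i))


def label_similarity(
    labels_a: Sequence[Sequence[int]], labels_b: Sequence[Sequence[int]]
) -> List[List[int]]:
    """S[i][j] = 1 if sample i and sample j share at least one label, else 0."""
    return [
        [1 if any(x and y for x, y in zip(la, lb)) else 0 for lb in labels_b]
        for la in labels_a
    ]


# Hypothetical latent representations (assumed already learned elsewhere).
image_latent = [
    [0.9, -0.2, 0.4, -0.7],
    [-0.5, 0.8, -0.1, 0.3],
    [0.7, -0.6, 0.2, -0.1],
]
text_latent = [[0.6, -0.1, 0.5, -0.9]]

# Hypothetical zero-one label vectors over three classes.
image_labels = [[1, 0, 0], [0, 1, 0], [1, 0, 1]]
text_labels = [[1, 0, 0]]

image_codes = binarize(image_latent)
text_codes = binarize(text_latent)

# Text-to-image retrieval: rank image codes against the text query code.
ranking = rank_by_hamming(text_codes[0], image_codes)

# Ground-truth cross-modal similarity derived from shared labels.
similarity = label_similarity(text_labels, image_labels)
```

In this toy setup the images that share a label with the text query receive codes identical to the query's and are ranked first, which is the behaviour that a label-guided hashing objective tries to encourage.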

Availability of Data and Materials

The data and materials used during the current study are available from the corresponding author on reasonable request.

Acknowledgements

I am very grateful to all those who helped me develop these ideas and put them into practice.

Funding

This study is supported in part by the Key-Area Research and Development Program of Guangdong Province under grant 2020B010166006, the National Natural Science Foundation of China under grants 61972102, 62202107, and 62176066, and the Guangzhou Science and Technology Plan Project under grant 2023A04J1729.

Author information

Authors and Affiliations

School of Computer Science and Technology, Guangzhou Panyu Polytechnic, No. 100, University City Outer Ring West Road, Panyu District, Guangzhou, 510006, Guangdong, China
Shaohua Teng, Wenbiao Huang, Tongbao Chen & Wei Zhang
Institute of Systems Engineering and Collaborative Laboratory for Intelligent Science and Systems, Macau University of Science and Technology, 999078, Macao, China
Naiqi Wu
School of Computer Science and Engineering, South China University of Technology, No. 382, University City Outer Ring East Road, Panyu District, Guangzhou, 510006, Guangdong, China
Guanglong Du
School of Information Engineering, Guangzhou Panyu Polytechnic, No. 1342 Shiliang Road, Panyu District, Guangzhou, 511483, Guangzhou, China
Luyao Teng

Corresponding author

Correspondence to Luyao Teng.

Ethics declarations

Competing interest

The authors declare that they have no competing interests.

Ethical Approval

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Teng, S., Huang, W., Wu, N. et al. Discrete cross-modal hashing with relaxation and label semantic guidance. World Wide Web 27, 4 (2024). https://doi.org/10.1007/s11280-024-01239-6

Received: 17 September 2023
Revised: 02 December 2023
Accepted: 06 December 2023
Published: 20 January 2024
DOI: https://doi.org/10.1007/s11280-024-01239-6
